ABSTRACT

We describe statistical methods for measuring the exoplanet multiplicity 
function — the fraction of host stars containing a given number of planets — from 
transit and radial-velocity surveys. The analysis is based on the approximation of 
separability — that the distribution of planetary parameters in an n-planet system 
is the product of identical 1-planet distributions. We review the evidence that 
separability is a valid approximation for exoplanets. We show how to relate the 
observable multiplicity function in surveys with similar host-star populations but 
different sensitivities. We also show how to correct for geometrical selection
effects to derive the multiplicity function from transit surveys if the distribution of
relative inclinations is known. Applying these tools to the Kepler transit survey
and to radial-velocity surveys, we find that (i) the Kepler data alone do not
constrain the mean inclination of multi-planet systems; even spherical distributions
are allowed by the data but only if a small fraction of host stars contain large
planet populations (> 30); (ii) comparing the Kepler and radial-velocity surveys
shows that the mean inclination of multi-planet systems lies in the range 0-5
degrees; (iii) the multiplicity function of the Kepler planets is not well-determined
by the present data.



1. Introduction 

The distribution of inclinations in multi-planet systems provides fundamental insights into 
planet formation. The small inclinations of the planets in the solar system — the largest 
is 7°, for Mercury — strongly suggest that they formed from a disk. However, we should 
not be surprised if extrasolar planetary systems have larger inclinations, for several
reasons: (i) the rms inclinations in the asteroid and Kuiper belts are substantially larger, 12°
and 16° respectively; (ii) in most astrophysical disks, the rms eccentricity and inclination
are correlated, and the eccentricities of extrasolar planets are much larger than those of
solar-system planets (0.23 for exoplanets with periods greater than 10 days, compared to
0.05); (iii) a number of dynamical mechanisms can excite inclinations, including
Kozai-Lidov oscillations, planet-planet scattering, and resonance sweeping; (iv) measurements of
the Rossiter-McLaughlin effect in transiting systems (e.g., Winn et al. 2010) show a broad
distribution of obliquities (angle between the spin axis of the host star and the orbit axis
of the planet) and some processes that excite obliquities do so by exciting inclinations; (v)
most extrasolar planetary systems have quite different configurations from the solar system,
so they may form by quite different mechanisms; (vi) there are still serious theoretical
obstacles to the formation of planets from a circumstellar disk, and several authors have suggested
that some or all planets may be formed by other mechanisms, more similar to star formation,
that would impart large inclinations to the planets (e.g., Black 1997; Papaloizou & Terquem
2001; Ribas & Miralda-Escudé 2007; Abt 2010).



There is only fragmentary evidence that extrasolar planetary systems have small relative 
inclinations: 



The mutual inclination of planets B and C in the system surrounding the pulsar
B1257+12 is less than ~13° (Konacki & Wolszczan 2003); this result is only marginally
relevant to planetary systems around main-sequence stars since pulsar planets must 
have had a very different history. 



Using radial-velocity and astrometric data, Bean & Seifahrt (2009) estimate that the
mutual inclination between GJ 876 b and c is 5.0° with asymmetric uncertainties. Using
radial velocities and dynamical modeling of the planet-planet interactions, Correia et al.
(2010) conclude that the mutual inclination is < 2°, while Baluev (2011) finds that the
same quantity is between 5 and 15°. The large scatter among these results means that they should
be used with caution. 



McArthur et al. (2010) find from astrometric and radial-velocity measurements that
the mutual inclination of υ And c and d is 30° ± 1°, much larger than in GJ 876 but
still small enough to suggest formation from a disk. 

Dynamical fits to the transit timing of two planets in the Kepler-9 system yield an
upper limit to the mutual inclination of ~10° (Holman et al. 2010). However, this



system was discovered in a transit survey, and such surveys are far more likely to detect 
multi-planet systems with small inclinations rather than large ones. 



Lissauer et al. (2011a) studied the six-planet system Kepler-11 and concluded that the
absence of transit duration changes in Kepler-11e implies that its inclination relative
to the mean orbital plane of other planets is less than 2 degrees^1; once again, this result
is biased by the strong dependence of the probability that two or more planets will 
transit on their mutual inclination. 



As noted above, if one planet in a two-planet system transits its host star as viewed
from Earth, the probability that the second planet will also transit is higher if the mutual
inclination of the two planetary orbits is small (e.g., Ragozzine & Holman 2010). This
argument suggests that the numbers of 1-planet, 2-planet, ..., N-planet systems detected in a
large transit survey contain information about both the multiplicity function (the fraction of
host stars containing 0, 1, 2, ..., N planets) and the inclination distribution. The challenge
is to disentangle the two distributions to distinguish thick systems with many planets from
thin systems with few planets.



The first attempt to do this was made by Lissauer et al. (2011b), who modeled the
number of multiple-planet systems detected in the first four months of data from the Kepler
survey (Borucki et al. 2011): 115 with two transiting planets, 45 with three, 8 with four, and
one each with 5 and 6. Lissauer et al. used a variety of simple models for the distribution
of the number of planets per system. They found that none of their models fit the data
well, mostly because they produced too few systems in which a single transiting planet was
observed, but that the best-fit models typically had mutual inclinations < 5°.

The purpose of this paper is to develop a general formalism that relates the intrinsic
properties of multi-planet systems to the properties of the multi-planet systems that are
detected in transit or other surveys (§2 and §3), and to apply this formalism to the Kepler
planet survey (§4) and to radial-velocity surveys (§5). Previous analyses have used Monte
Carlo simulations to explore these problems, but our calculations are mostly analytic or
semi-analytic and do not employ Monte Carlo methods.



1.1. Preliminaries 

First we introduce some notation. (i) The Kepler team uses the term planet "candidate" to
denote a possible planet that has been discovered through transits but not yet been confirmed
by radial-velocity measurements. Morton & Johnson (2011) estimate that 90% to 95% of the
Kepler planet candidates are real planets, so for the remainder of this paper we will simply
assume that all the Kepler planet candidates are real and delete the word "candidate".
(ii) We must constantly distinguish between the number of planets in a system and the
number of transiting planets in that system. We use the contraction "tranet" to denote
"transiting planet". Thus one could have, for example, a two-tranet, three-planet system
(Ragozzine & Holman 2010 call this a "double-transiting triple system"). (iii) We distinguish
two types of selection effects that limit a planet sample. Every survey has a set of detection
thresholds, determined by the parameters of the survey, that limit the properties of the
planets that it can detect (maximum orbit period, minimum reflex radial velocity, minimum
transit depth, etc.). A survey selection effect is a limitation on the number of detectable
planets due to the detection thresholds. A geometrical selection effect is a limitation arising
from the orientation of the planetary system; in particular, the planet must cross in front
of the stellar disk to be detectable in a transit survey^2.

1 Lissauer et al. also concluded that the mean mutual inclination of the planets was 1-2° from Monte
Carlo simulations of the probability that a randomly placed observer would see transits of all the planets;
however, this conclusion is suspect since the probability that a random star with six planets would show
six transits is different from the probability that one star from the Kepler sample of ~150,000 stars would
show six transits.

We assume that the stars in a survey may have 0, 1, ..., K planets and denote the
number of stars in the survey with k planets by $N_k$. Thus $\sum_{k=0}^{K} N_k$ is the total number of
stars in the survey. The vector $\mathbf{N} = (N_0, N_1, \ldots, N_K)$ is called the multiplicity function.

Because of survey and geometric selection effects, only a fraction of these planets will 
be detected in the survey. Let the survey selection matrix element $S_{km}$ be the probability
that a system containing m planets has k of them that pass the survey selection criteria.
Similarly, let the geometric selection matrix $G_{jk}$ be the probability that j of these k planets
pass the geometric selection criteria. Then the expected number of systems that the survey 
should detect with j tranets is 

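$$ n_j = \sum_{k=j}^{K} G_{jk} \sum_{m=k}^{K} S_{km} N_m, \quad \mbox{or} \quad \mathbf{n} = \mathbf{G} \cdot \mathbf{S} \cdot \mathbf{N}. \qquad (1) $$

We call $\mathbf{n}$ the observable multiplicity function. Clearly

$$ G_{mn} = S_{mn} = 0 \ \mbox{for} \ m > n, \quad G_{00} = S_{00} = 1, \quad G_{mn}, S_{mn} \ge 0. \qquad (2) $$

Moreover since the number of detectable planets in an n-planet system must be between
0 and n, we have

$$ \sum_{m=0}^{n} G_{mn} = \sum_{m=0}^{n} S_{mn} = 1. \qquad (3) $$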



2 There is also a geometrical selection effect in radial-velocity surveys, since the reflex velocity is
proportional to sin γ where γ is the inclination of the planetary orbit to the line of sight. However, we can eliminate
this effect by working only with the minimum mass M sin γ where M is the planet mass; of course, for transit
surveys sin γ ≃ 1 so the minimum mass equals the mass.
Thus G and S are (K + 1) x (K + 1) upper-triangular stochastic matrices. For physical
reasons G and S should commute (eq. 1 should not depend on whether we consider the
survey selection effects or the geometric selection effects first). We have confirmed that the 
commutator [G, S] is indeed zero for the selection matrices that we derive below. 



1.2. Separability 

Let $\mathbf{w}$ represent all of the orientation-independent properties of a planet and its host star that
determine its detectability (planetary mass and radius; stellar mass, radius, distance, and
luminosity; orbital period, etc.) and let $f(\mathbf{w}_1, \ldots, \mathbf{w}_n)$ represent the probability distribution
of these parameters for an n-planet system. Thus $\int d\mathbf{w}_1 \cdots d\mathbf{w}_n\, f(\mathbf{w}_1, \ldots, \mathbf{w}_n) = 1$.

A natural assumption for describing multi-planet systems is that the n-planet
distribution function is separable, that is,

$$ f(\mathbf{w}_1, \ldots, \mathbf{w}_n) = \prod_{m=1}^{n} f(\mathbf{w}_m), \qquad \int d\mathbf{w}\, f(\mathbf{w}) = 1. \qquad (4) $$

This assumption can only be approximately valid — for example, it is inconsistent with the 
observational finding that planets tend to be concentrated near mutual orbital resonances, 
and with the theoretical finding that planets separated by less than a few Hill radii are 
unstable. Nevertheless, we argue that the separability assumption is sufficiently accurate to 
provide a powerful tool for analyzing the statistics of multi-planet systems. We describe the 
evidence on its validity in §3.2.



2. Survey selection effects 

Let $\Theta_A(\mathbf{w})$ be the probability that a planet with properties $\mathbf{w}$ is detected in the survey
labeled by A if its host star is on the target list for this survey and the orientation of the
observer is correct (we assume that whether or not a planet can be detected is independent
of the presence or absence of other planets in the same system, which is a reasonable first
approximation). Thus the function $\Theta_A(\mathbf{w})$ describes the survey selection effects for A, but not
the geometric selection effects. The probability that a planet is detected, ignoring geometric
selection effects, is then

$$ W_A = \int f(\mathbf{w})\, \Theta_A(\mathbf{w})\, d\mathbf{w}. \qquad (5) $$
If the survey target list contains $N^A_m$ stars with m planets, then using the separability
assumption (4) the expected number of systems in which k planets will be detected is

$$ \overline{n}^A_k = \sum_{m=k}^{K} S_{km}(W_A)\, N^A_m, \qquad 0 \le k \le K; \qquad (6) $$



where the survey selection matrix $\mathbf{S}$ is a (K + 1) x (K + 1) matrix whose entries are given
by the binomial distribution,

$$ S_{km}(W) = \frac{m!}{k!\, (m-k)!}\, W^k (1 - W)^{m-k}, \qquad 0 \le k \le m \le K, \qquad (7) $$

and zero otherwise. Note that $\mathbf{S}(1)$ is the unit matrix. A useful identity is (e.g., Strum 1972)

$$ \mathbf{S}(A) \cdot \mathbf{S}(B) = \mathbf{S}(AB), \qquad (8) $$

which in turn implies

$$ \mathbf{S}^{-1}(W) = \mathbf{S}(W^{-1}). \qquad (9) $$

Although the physical motivation (eq. 6) for the definition of $\mathbf{S}(W)$ requires $0 \le W \le 1$, the
matrix is well-defined for all values of W.
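The identities (8) and (9) are easy to check numerically. The following is a minimal sketch, not part of the original analysis: the function name `selection_matrix` and the test values K = 6, A = 0.7, B = 0.4 are illustrative assumptions.

```python
import numpy as np
from math import comb

def selection_matrix(W, K):
    """Binomial matrix of eq. (7): S[k, m] = C(m, k) W^k (1 - W)^(m - k) for k <= m, else 0."""
    S = np.zeros((K + 1, K + 1))
    for m in range(K + 1):
        for k in range(m + 1):
            S[k, m] = comb(m, k) * W**k * (1.0 - W)**(m - k)
    return S

K = 6                    # illustrative maximum number of planets per star
A, B = 0.7, 0.4          # illustrative detection probabilities
S_A, S_B = selection_matrix(A, K), selection_matrix(B, K)

# Eq. (8): applying two selections in succession equals one selection with probability A*B.
product_ok = np.allclose(S_A @ S_B, selection_matrix(A * B, K))
# Eq. (9): the inverse is the same matrix evaluated at 1/W, even though 1/W > 1.
inverse_ok = np.allclose(np.linalg.inv(S_A), selection_matrix(1.0 / A, K))
# Eq. (3): every column sums to unity; S(1) is the unit matrix.
columns_ok = np.allclose(S_A.sum(axis=0), 1.0)
identity_ok = np.allclose(selection_matrix(1.0, K), np.eye(K + 1))
print(product_ok, inverse_ok, columns_ok, identity_ok)
```

The check of eq. (9) uses W = 1/0.7 > 1, which illustrates the remark that the matrix remains well-defined outside the physically motivated range of W.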

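With the assumption of separability it is straightforward to show that the conditional
probability distribution of the parameters $\mathbf{w}_m$, given that k planets are detected, is (cf. eq. 4)

$$ f(\mathbf{w}_1, \ldots, \mathbf{w}_k) = \prod_{m=1}^{k} f(\mathbf{w}_m), \qquad (10) $$

where $f$ now denotes the distribution as modified by the survey selection effects, that is,
proportional to $f(\mathbf{w})\, \Theta_A(\mathbf{w})$.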

Thus a separable distribution is still separable after survey selection effects are applied, so 
long as the selection effects depend only on the properties of an individual planet. 

The factor W (eq. 5) is usually difficult to determine reliably since (i) we do not have
good models for the distribution $f(\mathbf{w})$ of the planetary parameters; (ii) in most cases the
survey selection effects $\Theta(\mathbf{w})$ are not known accurately; (iii) in many cases the target list
from which a given sample of exoplanets was detected is not even known (the Kepler survey is
an exception to the last two limitations). However, useful results can be obtained without an
explicit evaluation of W. Suppose, for example, we have two surveys A and B that examine
populations of target stars with similar characteristics; then the ratio of the number of
m-planet systems in the target populations of the two surveys should be independent of m, so
$\mathbf{N}^B = c\, \mathbf{N}^A$ where c is a constant given by the ratio of the number of target stars in B and
A. Equation (6) can then be written, with an overbar denoting expected values,

$$ \overline{\mathbf{n}}^A = \mathbf{S}(W_A)\, \mathbf{N}^A, \qquad \overline{\mathbf{n}}^B = c\, \mathbf{S}(W_B)\, \mathbf{N}^A. \qquad (11) $$
Applying equations (8) and (9), we have

$$ \overline{\mathbf{n}}^B = c\, \mathbf{S}(f_{BA})\, \overline{\mathbf{n}}^A, \qquad (12) $$

where $f_{BA} = W_B/W_A = 1/f_{AB}$. Thus the observable multiplicity function $\overline{\mathbf{n}}^B$ of survey B
is directly related to that of survey A by a matrix that depends only on a single parameter
$f_{BA}$ (the normalization constant c is known, since it is just the ratio of the number of target
stars in the two surveys). The parameter $f_{BA}$ can be eliminated if we plot $\overline{n}^B_2, \overline{n}^B_3, \ldots$ as
functions of $\overline{n}^B_1$. In practice we must use the observed multiplicity function $\mathbf{n}^A$ rather than
its expectation $\overline{\mathbf{n}}^A$ on the right side of equation (12), but these should not be very different
so long as $n^A_k \gg 1$.
Equation (12) is well-defined whether $f_{BA}$ is larger or smaller than unity, but if $f_{BA} > 1$
the statistical errors will be amplified and it is likely that some of the predicted values of
$\overline{\mathbf{n}}^B$ will be negative, which is unphysical. Thus, if the separability approximation is valid,
the observable multiplicity function of deep surveys can be used to predict the observable
multiplicity function of shallow surveys (but not vice versa).
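The conversion in equation (12) can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure actually used for Figure 1: the counts are the tranet numbers quoted in §4.1, the choice c = 1 assumes that surveys A and B share a target list, and the grid of $f_{BA}$ values and the names `selection_matrix` and `predict_shallow` are hypothetical.

```python
import numpy as np
from math import comb

def selection_matrix(W, K):
    """Binomial matrix of eq. (7): S[k, m] = C(m, k) W^k (1 - W)^(m - k) for k <= m."""
    S = np.zeros((K + 1, K + 1))
    for m in range(K + 1):
        for k in range(m + 1):
            S[k, m] = comb(m, k) * W**k * (1.0 - W)**(m - k)
    return S

# Counts n_0..n_6 of stars with 0..6 tranets quoted for the trimmed Kepler catalog in Section 4.1.
n_A = np.array([1.237e5, 737, 104, 37, 7, 1, 1])
K = len(n_A) - 1
c = 1.0   # assumption: surveys A and B share the same target list

def predict_shallow(f_BA):
    """Eq. (12): predicted counts in the shallower survey B from the counts in survey A."""
    return c * selection_matrix(f_BA, K) @ n_A

# Sweep the unknown f_BA. Plotting columns 1..3 of `curve` (n_2, n_3, n_4) against
# column 0 (n_1) eliminates f_BA, as described in the text.
f_grid = np.linspace(0.1, 1.0, 10)
curve = np.array([predict_shallow(f)[1:5] for f in f_grid])

n_B = predict_shallow(0.5)   # hypothetical survey that detects half as many planets
print(np.round(n_B[1:], 2))
```

Because each column of S sums to unity, the total number of stars is conserved, while the expected total number of detected planets scales by $f_{BA}$; both properties make convenient sanity checks on an implementation.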



Example. To illustrate this procedure, we examine the Kepler catalog of Borucki et al.
(2011), trimmed by 20% as described at the start of §4 to produce a more homogeneous
set of target stars. This is catalog A. All of the planets in this catalog are detected with a
signal/noise ratio (SNR) of at least 7. We construct a sequence of shallower catalogs (catalogs
"B") by gradually increasing the minimum SNR up to values exceeding 100, at which point
only a handful of multi-planet systems is left. The relation (12) implies that apart from
statistical fluctuations the numbers of multiple-planet systems $n^B_k$, k = 1, ..., are functions
only of $f_{BA}$ and the known $\mathbf{n}^A$, which approximates the observable multiplicity function $\overline{\mathbf{n}}^A$.
Hence by eliminating $f_{BA}$ in favor of $n^B_1$, the number of k-planet systems in any survey B
can be predicted as a function of the number of one-planet systems in that survey. These
predictions for k = 2, 3, 4 are shown in the upper left panel of Figure 1 as solid lines, along
with the 1-σ confidence bands (dashed lines). The actual numbers of multi-planet systems
after SNR cuts on the Kepler data are shown as open circles. Within the statistical errors
the predictions agree with the data for k = 2 and 3 and are marginally consistent for k = 4:
using a Kolmogorov-Smirnov (KS) test^3, the p-value (probability of observing deviations at
least as extreme as those seen, given the null hypothesis) is 0.28, 0.27, and 0.06 respectively.

The upper right panel of Figure 1 shows a similar comparison for a sequence of catalogs
based on cuts at increasing planet radius, rather than SNR. The results are consistent with
the separable model to within the statistical errors (p-values of 0.72, 0.19, and 0.66 for
k = 2, 3, 4).

3 The use of a KS test is not strictly applicable since $n_k$ and $n_1$ are cumulative distributions of a third
parameter, the SNR, rather than being directly related. However, the results should be approximately correct
when $n_1 \gg n_k$, which is usually the case.

Fig. 1. The observable multiplicity function for subsets of the Kepler and radial-velocity
planet samples (top and bottom panels, respectively). The catalog subsets are defined by
imposing cuts based on signal-to-noise ratio (SNR), planet radius, orbital period, or velocity
semi-amplitude ($K_{RV}$). Open circles show $n_2$, $n_3$, $n_4$ (numbers of 2, 3, and 4-planet systems)
as a function of $n_1$. Solid and dashed curves show the predictions of equation (12) and the
1-σ errors on the predictions.

The lower panels of Figure 1 show similar results for radial-velocity surveys. The "A"
catalog consists of 240 FGK dwarf stars hosting one or more planets (see eq. H6] for more 
detail), and the cuts are based on $K_{RV}$ (semi-amplitude of the radial-velocity curve) on the
left and orbital period on the right. The predictions are marginally consistent with the null
hypothesis (p-values between 0.03 and 0.10) except for $n_2$ as a function of the cut in $K_{RV}$,
for which the null hypothesis is excluded.

These results confirm that in many cases the separability approximation and equation
(12) provide useful tools for removing survey selection effects and converting the observable
multiplicity function between surveys.



3. Geometric selection effects in transiting systems 



Throughout this paper we shall assume that tranets are in circular orbits. Moorhead et al.
(2011) estimate that the mean eccentricity of planets discovered in the Kepler survey is only
0.1-0.25, so this assumption should not cause significant errors. We shall also assume that
a transit occurs when the line of sight to the center of the planet intersects the stellar disk.
This assumption should be approximately correct so long as the planetary radius is much
smaller than the stellar radius (the median ratio of planetary radius to stellar radius in the
Kepler survey is only 0.026).

Let $R_*$ be the radius of the star, a the semi-major axis of a planet in a circular orbit,
and $\epsilon = R_*/a$. Consider a system containing n planets with semi-major axes specified by
$\epsilon_1, \ldots, \epsilon_n$. Let $g_{mn}(\epsilon_1, \ldots, \epsilon_n)$ be the probability that a randomly oriented observer will
detect m tranets in this system.



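One planet. First consider the case n = 1. We define three unit vectors: $\hat{\mathbf{o}}$ points towards
the observer, $\hat{\mathbf{n}}$ is normal to the planetary orbit, and $\hat{\mathbf{z}}$ is normal to the reference plane from
which inclinations i are measured. Thus $\hat{\mathbf{z}} \cdot \hat{\mathbf{n}} = \cos i$ and $\hat{\mathbf{o}} \cdot \hat{\mathbf{n}} = \cos\gamma$. If the planet's size
is negligible, it transits if and only if $|\hat{\mathbf{o}} \cdot \hat{\mathbf{n}}| < \epsilon$ or $|\cos\gamma| < \epsilon$ so

$$ g_{11}(\epsilon) = 1 - g_{01}(\epsilon) = \frac{\int_{|\cos\gamma| < \epsilon} \sin\gamma\, d\gamma}{\int \sin\gamma\, d\gamma} = \epsilon. \qquad (13) $$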
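Two planets. Let h(w) = 1 if |w| < 1 and zero otherwise. Then transits occur if and only
if $h(\epsilon^{-1} \cos\gamma)$ is unity and we may write

$$ h(\epsilon^{-1} \cos\gamma) = \sum_{\ell=0}^{\infty} b_\ell(\epsilon)\, P_\ell(\cos\gamma), \qquad (14) $$

where $P_\ell$ is a Legendre polynomial. From the properties of these functions we have

$$ b_\ell(\epsilon) = \begin{cases} \epsilon, & \ell = 0, \\ P_{\ell+1}(\epsilon) - P_{\ell-1}(\epsilon), & \ell \mbox{ even}, \ \ell > 0, \\ 0, & \ell \mbox{ odd}. \end{cases} \qquad (15) $$

Now let $(\theta, \phi)$ be the polar coordinates for $\hat{\mathbf{o}}$ relative to the polar axis $\hat{\mathbf{z}}$, and $(i, \Omega - \frac{1}{2}\pi)$ the
polar coordinates for $\hat{\mathbf{n}}$. Then

$$ h(\epsilon^{-1} \cos\gamma) = 4\pi \sum_{\ell=0}^{\infty} \frac{b_\ell(\epsilon)}{2\ell+1} \sum_{m=-\ell}^{\ell} Y^*_{\ell m}(\theta, \phi)\, Y_{\ell m}(i, \Omega - \tfrac{1}{2}\pi). \qquad (16) $$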

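Let the probability distribution of planetary inclinations be $q(i|\kappa)\, di$, where $\kappa$ is a set of free
parameters describing the inclination distribution, which we may vary to fit the observations.
Then the probability of a transit of a single planet, given the observer orientation $x = \cos\theta$,
is

$$ u(x|\epsilon, \kappa) = \int \frac{d\Omega}{2\pi}\, di\, q(i|\kappa)\, h(\epsilon^{-1} \cos\gamma) = 4\pi \sum_{\ell=0}^{\infty} \frac{b_\ell(\epsilon)}{2\ell+1} \int di\, q(i|\kappa)\, Y^*_{\ell 0}(\theta, 0)\, Y_{\ell 0}(i, 0) = \sum_{\ell=0}^{\infty} Q_\ell(\kappa)\, b_\ell(\epsilon)\, P_\ell(x), \qquad (17) $$

where

$$ Q_\ell(\kappa) = \int di\, q(i|\kappa)\, P_\ell(\cos i), \qquad Q_0 = 1. \qquad (18) $$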

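If a system contains two planets, the probability that both transit for a random
orientation of the observer is

$$ g_{22}(\epsilon_1, \epsilon_2, \kappa) = \frac{1}{2} \int_{-1}^{1} dx\, u(x|\epsilon_1, \kappa)\, u(x|\epsilon_2, \kappa) = \frac{1}{2} \sum_{\ell,n=0}^{\infty} b_\ell(\epsilon_1)\, b_n(\epsilon_2)\, Q_\ell(\kappa)\, Q_n(\kappa) \int_{-1}^{1} dx\, P_\ell(x)\, P_n(x) = \sum_{\ell=0}^{\infty} \frac{Q_\ell^2(\kappa)}{2\ell+1}\, b_\ell(\epsilon_1)\, b_\ell(\epsilon_2). \qquad (19) $$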
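Moreover the probability that one and only one of the two planets transits is

$$ g_{12}(\epsilon_1, \epsilon_2, \kappa) = \frac{1}{2} \int_{-1}^{1} dx\, \{ u(x|\epsilon_1, \kappa)[1 - u(x|\epsilon_2, \kappa)] + [1 - u(x|\epsilon_1, \kappa)]\, u(x|\epsilon_2, \kappa) \} = g_{11}(\epsilon_1, \kappa) + g_{11}(\epsilon_2, \kappa) - 2 g_{22}(\epsilon_1, \epsilon_2, \kappa), \qquad (20) $$

and the probability that no planets transit is

$$ g_{02}(\epsilon_1, \epsilon_2, \kappa) = 1 - g_{12}(\epsilon_1, \epsilon_2, \kappa) - g_{22}(\epsilon_1, \epsilon_2, \kappa) = 1 - g_{11}(\epsilon_1, \kappa) - g_{11}(\epsilon_2, \kappa) + g_{22}(\epsilon_1, \epsilon_2, \kappa). \qquad (21) $$

For example, if the planets are distributed isotropically then $q(i)\, di = \frac{1}{2} \sin i\, di$, $Q_\ell = \delta_{\ell 0}$
and $g_{22}(\epsilon_1, \epsilon_2) = \epsilon_1 \epsilon_2$. If the planets have zero inclination, it can be shown that

$$ g_{22}(\epsilon_1, \epsilon_2) = \sum_{\ell=0}^{\infty} \frac{b_\ell(\epsilon_1)\, b_\ell(\epsilon_2)}{2\ell+1} = \min(\epsilon_1, \epsilon_2), \qquad (22) $$

although this result is derived more easily in other ways.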
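The two limiting cases just quoted give a convenient numerical check on the coefficients $b_\ell(\epsilon)$ and the series for $g_{22}$. The sketch below is illustrative only: the function names, the values $\epsilon_1 = 0.05$ and $\epsilon_2 = 0.02$, and the truncation at $\ell = 2000$ are assumptions, and the Legendre polynomials are generated with the standard three-term recurrence.

```python
import numpy as np

def legendre_values(x, lmax):
    """P_0(x), ..., P_lmax(x) at a scalar x, by the three-term recurrence."""
    P = np.empty(lmax + 1)
    P[0] = 1.0
    if lmax >= 1:
        P[1] = x
    for l in range(1, lmax):
        P[l + 1] = ((2 * l + 1) * x * P[l] - l * P[l - 1]) / (l + 1)
    return P

def b_coeffs(eps, lmax):
    """Coefficients b_l(eps) of eq. (15) for l = 0..lmax."""
    P = legendre_values(eps, lmax + 1)
    b = np.zeros(lmax + 1)
    b[0] = eps
    for l in range(2, lmax + 1, 2):   # odd-l coefficients stay zero
        b[l] = P[l + 1] - P[l - 1]
    return b

def g22(eps1, eps2, Q):
    """Series of eq. (19), truncated at l = len(Q) - 1; Q[l] are the moments of eq. (18)."""
    Q = np.asarray(Q, dtype=float)
    lmax = len(Q) - 1
    ells = np.arange(lmax + 1)
    terms = Q**2 / (2 * ells + 1) * b_coeffs(eps1, lmax) * b_coeffs(eps2, lmax)
    return float(terms.sum())

eps1, eps2, lmax = 0.05, 0.02, 2000      # illustrative values
Q_isotropic = np.zeros(lmax + 1)
Q_isotropic[0] = 1.0                     # Q_l = delta_{l0}
Q_coplanar = np.ones(lmax + 1)           # zero inclination: Q_l = 1 for all l

g22_iso = g22(eps1, eps2, Q_isotropic)   # expected to equal eps1 * eps2
g22_flat = g22(eps1, eps2, Q_coplanar)   # expected to approach min(eps1, eps2)
print(g22_iso, g22_flat)
```

The coplanar series converges slowly (its terms fall off roughly as $\ell^{-2}$), which is consistent with the later remark that very thin disks may require more terms than the default truncation.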

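Three or more planets. These results can be extended to any number of planets^4:

$$ g_{mn}(\epsilon_1, \ldots, \epsilon_n, \kappa) = \frac{1}{2\, m!\, (n-m)!} \int_{-1}^{1} dx \sum_{P_n} \prod_{j=1}^{m} u(x|\epsilon_{p_j}, \kappa) \prod_{j=m+1}^{n} [1 - u(x|\epsilon_{p_j}, \kappa)], \qquad (23) $$

where $P_n$ is the set of all permutations $(p_1, \ldots, p_n)$ of the numbers 1, ..., n, and $m \le n$; the
factor $1/[m!\, (n-m)!]$ removes the multiple counting of permutations that select the same set of
transiting planets. For example,

$$ g_{23}(\epsilon_1, \epsilon_2, \epsilon_3, \kappa) = g_{22}(\epsilon_1, \epsilon_2, \kappa) + g_{22}(\epsilon_2, \epsilon_3, \kappa) + g_{22}(\epsilon_3, \epsilon_1, \kappa) - 3 g_{33}(\epsilon_1, \epsilon_2, \epsilon_3, \kappa), $$
$$ g_{13}(\epsilon_1, \epsilon_2, \epsilon_3, \kappa) = g_{11}(\epsilon_1, \kappa) + g_{11}(\epsilon_2, \kappa) + g_{11}(\epsilon_3, \kappa) - 2 g_{22}(\epsilon_1, \epsilon_2, \kappa) - 2 g_{22}(\epsilon_2, \epsilon_3, \kappa) - 2 g_{22}(\epsilon_3, \epsilon_1, \kappa) + 3 g_{33}(\epsilon_1, \epsilon_2, \epsilon_3, \kappa). \qquad (24) $$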

The geometric selection matrix $G_{mn}$ (eq. 1) is simply $\langle g_{mn}(R_*/a_1, R_*/a_2, \ldots, R_*/a_n) \rangle$,
the average of the geometric selection factor over the joint distribution of stellar radius $R_*$
and planetary semi-major axis a for the survey. To evaluate $G_{mn}(\kappa)$ we use the separability
assumption (4) with respect to $\epsilon = R_*/a$. Thus

$$ G_{mn}(\kappa) = \int g_{mn}(\epsilon_1, \ldots, \epsilon_n, \kappa) \prod_{k=1}^{n} f(\epsilon_k)\, d\log\epsilon_k, \qquad (25) $$

where $f(\epsilon)\, d\log\epsilon$ represents the probability distribution of $\epsilon$ as modified by the survey
selection effects.

4 For n = 3 the functions $g_{mn}$ can be expressed as series in the Wigner 3-j symbols, but in practice it is
simpler to evaluate the integral (23) numerically for any n > 2.

With this parametrization and equations (17) and (23) it is straightforward to show
that $G_{mn}(\kappa)$ is given by the binomial distribution,

$$ G_{mn}(\kappa) = \frac{n!}{2\, m!\, (n-m)!} \int_{-1}^{1} dx\, U^m(x|\kappa)\, [1 - U(x|\kappa)]^{n-m} = \frac{1}{2} \int_{-1}^{1} dx\, S_{mn}[U(x|\kappa)], \qquad (26) $$

where $S_{mn}$ is given by equation (7),

$$ U(x|\kappa) = \frac{\int f(\epsilon)\, u(x|\epsilon, \kappa)\, d\log\epsilon}{\int f(\epsilon)\, d\log\epsilon} = \sum_{\ell=0}^{\infty} Q_\ell(\kappa)\, B_\ell\, P_\ell(x), \qquad (27) $$

and

$$ B_\ell = \int f(\epsilon)\, b_\ell(\epsilon)\, d\log\epsilon \quad \mbox{with} \quad \int f(\epsilon)\, d\log\epsilon = 1. \qquad (28) $$

Since $B_\ell$ does not depend on the unknown parameters $\kappa$ of the inclination distribution it can
be evaluated once and for all at the start of any optimization procedure. It is straightforward
to show that the relations (2) are satisfied by these formulae, and that the matrices G
and S commute. In numerical work we typically truncate infinite series such as (27) at
$\ell = \ell_{\rm max} = 50$, but for very thin disks it may be necessary to include terms of higher $\ell$.
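A compact way to see how equations (26)-(28) fit together is to evaluate G by Gauss-Legendre quadrature in x. The sketch below makes a strong simplifying assumption that is not in the text: every planet is given the same value of $\epsilon$, so that $B_\ell = b_\ell(\epsilon)$. The function names, K = 4, $\epsilon = 0.05$, and the number of quadrature nodes are likewise illustrative.

```python
import numpy as np
from math import comb

def legendre_table(x, lmax):
    """Array P[l, j] = P_l(x_j) for l = 0..lmax, by the three-term recurrence."""
    x = np.asarray(x, dtype=float)
    P = np.empty((lmax + 1, x.size))
    P[0] = 1.0
    if lmax >= 1:
        P[1] = x
    for l in range(1, lmax):
        P[l + 1] = ((2 * l + 1) * x * P[l] - l * P[l - 1]) / (l + 1)
    return P

def b_coeffs(eps, lmax):
    """Coefficients b_l(eps) of eq. (15) for l = 0..lmax."""
    P = legendre_table([eps], lmax + 1)[:, 0]
    b = np.zeros(lmax + 1)
    b[0] = eps
    for l in range(2, lmax + 1, 2):
        b[l] = P[l + 1] - P[l - 1]
    return b

def geometric_matrix(eps, Q, K, nquad=128):
    """G[m, n] of eq. (26), assuming all planets share one eps so that B_l = b_l(eps)."""
    Q = np.asarray(Q, dtype=float)
    lmax = len(Q) - 1
    x, w = np.polynomial.legendre.leggauss(nquad)            # nodes and weights on [-1, 1]
    U = (Q * b_coeffs(eps, lmax)) @ legendre_table(x, lmax)  # eq. (27) on the nodes
    G = np.zeros((K + 1, K + 1))
    for n in range(K + 1):
        for m in range(n + 1):
            G[m, n] = 0.5 * np.sum(w * comb(n, m) * U**m * (1.0 - U)**(n - m))
    return G

K, eps, lmax = 4, 0.05, 50            # illustrative values; lmax = 50 as in the text
Q_isotropic = np.zeros(lmax + 1)
Q_isotropic[0] = 1.0                  # isotropic orbits: Q_l = delta_{l0}
G_iso = geometric_matrix(eps, Q_isotropic, K)
print(np.round(G_iso[:, K], 6))       # last column: 0..4 tranets in a 4-planet system
```

For isotropic orbits U(x) is the constant $\epsilon$, so G reduces to the binomial matrix $\mathbf{S}(\epsilon)$ of equation (7). For any set of moments $Q_\ell$ the columns sum to unity and the mean number of tranets in an n-planet system is $n\epsilon$, independent of the inclination distribution.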

We pointed out in equation (10) that most survey selection effects preserve the
separability assumption. This result does not generally hold for geometric selection effects. To
illustrate this, consider the simple case of a population of stars containing two planets, with
zero relative inclination. Write the probability distribution of $\epsilon = R_*/a$ of two-planet
systems as $f(\epsilon_1)\, f(\epsilon_2)\, d\log\epsilon_1\, d\log\epsilon_2$ (after survey selection effects but before geometric selection
effects). Then using equation (22) it is evident that the probability distribution of two-tranet
systems is

$$ dp_2(\epsilon_1, \epsilon_2) = f(\epsilon_1)\, f(\epsilon_2)\, \min(\epsilon_1, \epsilon_2)\, d\log\epsilon_1\, d\log\epsilon_2, \qquad (29) $$

which is not separable. Only for isotropic distributions do geometric selection effects preserve
separability.



3.1. The inclination distribution 

In this paper we model the probability distribution of the inclinations $dp = q(i|\kappa)\, di$ as a
Fisher distribution,

$$ q(i|\kappa) = \frac{\kappa}{2 \sinh\kappa} \exp(\kappa \cos i) \sin i. \qquad (30) $$

The parameter $\kappa$ is related to the mean-square value of $\sin i$ through

$$ \langle \sin^2 i \rangle = \int di\, \sin^2 i\; q(i|\kappa) = 2\, \frac{\coth\kappa}{\kappa} - \frac{2}{\kappa^2}. \qquad (31) $$

When $\kappa \ll 1$ the Fisher distribution approaches an isotropic distribution, $\lim_{\kappa \to 0} q(i|\kappa) =
\frac{1}{2} \sin i$, while for $\kappa \gg 1$ it approaches the Rayleigh distribution, $\lim_{\kappa \to \infty} q(i|\kappa) = (2i/s^2)
\exp(-i^2/s^2)$ where $s = (2/\kappa)^{1/2}$ is the rms inclination and $\frac{1}{2} \pi^{1/2} s = 0.8862 s$ is the mean
inclination. The Rayleigh distribution is commonly used to model the inclination
distribution of asteroids, Kuiper-belt objects, stars in the Galactic disk (where it is known as
the Schwarzschild distribution), etc. As $\kappa \to -\infty$ the Fisher distribution approaches a
retrograde Rayleigh distribution.

For the Fisher distribution, equation (18) becomes

$$ Q_\ell(\kappa) = \sqrt{\frac{\pi\kappa}{2}}\, \frac{I_{\ell+1/2}(\kappa)}{\sinh\kappa}, \qquad (32) $$

where I denotes a modified Bessel function.
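The Bessel-function form of the moments can be cross-checked against direct quadrature of equation (18). The sketch below assumes the standard closed form $Q_\ell(\kappa) = (\pi\kappa/2)^{1/2} I_{\ell+1/2}(\kappa)/\sinh\kappa$ for $\kappa > 0$; the function names, the value $\kappa = 5$, and the quadrature order are illustrative choices. It uses SciPy's exponentially scaled Bessel function `ive` to avoid overflow at large $\kappa$.

```python
import numpy as np
from scipy.special import ive, eval_legendre

def fisher_Q(kappa, lmax):
    """Closed-form moments Q_l(kappa) of the Fisher distribution for kappa > 0 (assumed form)."""
    ells = np.arange(lmax + 1)
    # ive(v, k) = iv(v, k) * exp(-k) for k > 0, and sinh(k) * exp(-k) = (1 - exp(-2k)) / 2.
    scaled_sinh = 0.5 * (1.0 - np.exp(-2.0 * kappa))
    return np.sqrt(0.5 * np.pi * kappa) * ive(ells + 0.5, kappa) / scaled_sinh

def fisher_Q_quadrature(kappa, lmax, nquad=200):
    """Eq. (18) by Gauss-Legendre quadrature in x = cos(i)."""
    x, w = np.polynomial.legendre.leggauss(nquad)
    # q(i) di = kappa exp(kappa x) / (2 sinh kappa) dx, written in an overflow-safe form.
    density = kappa * np.exp(kappa * (x - 1.0)) / (1.0 - np.exp(-2.0 * kappa))
    return np.array([np.sum(w * density * eval_legendre(l, x)) for l in range(lmax + 1)])

def mean_sin2(kappa):
    """Mean-square sin(i): 2 coth(kappa)/kappa - 2/kappa^2."""
    return 2.0 / (kappa * np.tanh(kappa)) - 2.0 / kappa**2

kappa = 5.0                                   # illustrative concentration parameter
Q_closed = fisher_Q(kappa, 6)
Q_numeric = fisher_Q_quadrature(kappa, 6)
rms_inclination_deg = np.degrees(np.sqrt(2.0 / kappa))   # s = (2/kappa)^(1/2), valid for kappa >> 1
print(np.round(Q_closed, 6))
```

Two quick consistency checks follow from the definitions: $Q_0 = 1$ by normalization, and $Q_1 = \coth\kappa - 1/\kappa$, the mean of $\cos i$.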



3.2. Validity of the separability assumption 

There is limited evidence on the accuracy of the separability approximation for multi-planet 
systems. First consider RV surveys, in which there are no geometric selection effects. The 
most important survey selection effects depend only on the properties of an individual planet 
so an RV survey of a separable parent distribution should lead to a separable detected 
distribution (eq. 10).



Wright et al. (2009) compare 28 multi-planet systems and a much larger number of
single-planet systems detected by RV surveys. They find that (i) the eccentricities in
multi-planet systems are smaller (mean eccentricity 0.22, compared to 0.30 in single-planet
systems); (ii) the logarithmic semi-major axis distribution in multi-planet systems is flatter,
without the pileup of hot Jupiters between 0.03 AU and 0.07 AU and the enhancement
outside 1 AU that are seen in single-planet systems; (iii) multi-planet systems exhibit an
overabundance of planets with minimum mass between 0.01 and 0.2 Jupiter masses. These
differences are incompatible with separability and statistically significant (p < 0.03), but
relatively small: they represent maximum differences of only 0.18, 0.17, and 0.26 in the
cumulative probability distributions for eccentricity, semi-major axis, and minimum mass.
Wright et al. (2009) point out that the last of these differences may also be amplified by
an unmodeled survey selection effect: stars hosting planets tend to be observed more
frequently, thereby enhancing the chance to discover additional low-mass planets. Most of the
plots in the lower panels of Figure 1 are marginally consistent with separability, as discussed
at the end of §2.

The evidence on separability from the Kepler survey is more difficult to interpret,
because geometric selection effects do not preserve separability (see discussion just before §3.1).
Nevertheless, the semi-major axis distributions of single- and multiple-tranet systems in the
Kepler survey are indistinguishable according to a KS test (p-value 0.20; see also Figure 2),
which is consistent with separability. Presumably the pileup of hot Jupiters at small
semi-major axes seen in the RV surveys is less prominent in the Kepler sample because the typical
planetary mass is much smaller, and the jump outside 1 AU is not seen because Kepler is not
sensitive to these orbital periods.



Latham et al. (2011) have shown that Kepler systems with multiple tranets are less likely
to include a giant planet (larger than Neptune) than systems with a single tranet. We confirm
using a KS test that the distributions of radii in the single- and multiple-tranet systems are
different (maximum difference in the cumulative probability distribution of 0.20). However,
the results at the end of §2 show that the numbers of two-, three-, and four-tranet systems as
a function of the radius cutoff appear to be consistent with separability. Evidently equations
such as (12) that we use to compare the observable multiplicity function between surveys
are less sensitive to deviations from separability than statistical tests designed specifically
for this purpose.

These comparisons suggest that deviations from separability, though present in both the 
RV and Kepler planet samples, are not large enough to compromise our method and results. 
However, further exploration of both the magnitude and the effects of these deviations is 
needed. 



4. Estimating the inclination distribution and the multiplicity function from the Kepler survey

4.1. Properties of the survey 



The Kepler survey has a complex set of survey selection effects, which we do not attempt
to model. The constraints on the multiplicity function that we derive therefore apply to
the population of planets in radius, semi-major axis, etc. that Kepler detects, whatever
that population may be (for a discussion of selection effects and completeness in the Kepler
catalog see Howard et al. 2011 and Youdin 2011). If we denote the multiplicity function of
this population by $\mathbf{N}$ and the observable multiplicity function of the Kepler survey by $\mathbf{n}$
then equation (1) becomes

$$ \mathbf{n} = \mathbf{G} \cdot \mathbf{N}. \qquad (33) $$

The validity of this equation requires only the plausible assumption that the probability that
Kepler will detect a given transiting planet around a given star is independent of whether it
detects other transits around the same star.



To produce a more homogeneous sample, we trim the catalog of Borucki et al. (2011) to
include only stars with effective temperatures between 4000 and 6500 K and surface gravity
log g > 4.0 (roughly equivalent to FGK dwarfs), and to Kepler magnitudes between 9.0 and
16.0; this trimming leaves 124,613 stars from the original sample of 153,196. We also restrict
the catalog to planets with orbital period less than 200 d and radius less than 2 Jupiter
radii; this leaves 1092 planets from the original sample of 1235. The numbers of stars with
0, 1, 2, ... tranets are

$$ n_0 = 1.237 \times 10^5, \quad n_1 = 737, \quad n_2 = 104, \quad n_3 = 37, \quad n_4 = 7, \quad n_5 = 1, \quad n_6 = 1, \quad n_k = 0 \ \mbox{for} \ k > 6. \qquad (34) $$

We need to determine the function $f(\epsilon)$, where $f(\epsilon)\,d\log\epsilon$ is the fraction of planets in
the range $d\log\epsilon$ given the intrinsic distribution of planets and the survey selection effects
for Kepler. As usual $\epsilon = R_*/a$ is the ratio of stellar radius to planetary semi-major axis; the
stellar radius is determined from the host-star mass and surface gravity and the semi-major
axis is determined from the host-star mass and the planetary orbital period. Figure 2 shows
data points for $f(\epsilon)$ from single-tranet systems (red points) and from planets in multi-tranet
systems (blue points). The data points have been constructed by adding a contribution of
$\epsilon^{-1}$ (to account for geometric selection effects) from each tranet to the corresponding bin,
then normalizing so that the integral over $\log\epsilon$ is unity. The distributions for single-tranet
and multi-tranet systems are quite similar, and can be adequately fit by the parametrization

$$ f(\epsilon) = 0.656\,\frac{(\epsilon/\epsilon_0)^{1/2}}{1 + (\epsilon/\epsilon_0)^{3.6}}, \qquad \epsilon_0 = 0.055, \qquad \text{for } \epsilon > 0.004 \tag{35} $$

and zero for $\epsilon < 0.004$. The sharp decline for $\epsilon > 0.1$ is due to an absence of planets with
semi-major axis < 0.04 AU (Borucki et al. 2011), while the cutoff at $\epsilon < 0.004$ is due to the
limited timespan of the Kepler data.
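
The fitting formula can be checked numerically. The sketch below is illustrative only and rests on several assumptions that are not stated in this section: that the numerator of the fit is $(\epsilon/\epsilon_0)^{1/2}$, that the logarithm is natural, that the distribution is cut off at $\epsilon = 1$, and that the mean transit probability of a randomly oriented planet is the mean of $\epsilon$ over the distribution. Under these assumptions the formula integrates to approximately unity and gives a mean transit probability close to the value 0.0321 quoted in §4.3. The helper names are hypothetical.

```python
import math

EPS0 = 0.055      # break scale of the fit
EPS_MIN = 0.004   # lower cutoff of the fit
EPS_MAX = 1.0     # assumed upper limit (orbit at the stellar surface)

def f(eps):
    """Fitting formula for the distribution of eps = R*/a per unit ln(eps).

    The square-root numerator is an assumption of this sketch.
    """
    if eps < EPS_MIN:
        return 0.0
    x = eps / EPS0
    return 0.656 * math.sqrt(x) / (1.0 + x ** 3.6)

def integrate_log(g, lo, hi, steps=20000):
    """Midpoint rule for the integral of g(eps) d ln(eps) between lo and hi."""
    a, b = math.log(lo), math.log(hi)
    h = (b - a) / steps
    return h * sum(g(math.exp(a + (j + 0.5) * h)) for j in range(steps))

norm = integrate_log(f, EPS_MIN, EPS_MAX)
mean_eps = integrate_log(lambda e: e * f(e), EPS_MIN, EPS_MAX)

print(f"normalization          = {norm:.4f}")
print(f"mean transit probability = {mean_eps:.4f}")
```

The integration is done in $\ln\epsilon$ because the distribution is defined per logarithmic interval and spans more than two decades in $\epsilon$.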
Fig. 2. The probability distribution of $\epsilon = R_*/a$, the ratio of stellar radius to planetary
semi-major axis, for tranets detected by Kepler. The differential probability distribution is
$dp = f(\epsilon)\,d\log\epsilon$. The data points for single- and multi-tranet systems are shown separately.
The solid line shows the analytic fitting formula (35).

4.2. Statistical method

The probability that the survey actually detects $\{n_0, n_1, \ldots, n_K\}$ stars having 0, 1, ..., K
planets is

$$ P(\mathbf{n}|\mathbf{N}, \kappa) = \prod_{k=0}^{K} \frac{\bar n_k^{\,n_k}\, e^{-\bar n_k}}{n_k!} \tag{36} $$

where $\mathbf{n} = (n_0, n_1, \ldots, n_K)$ and $\bar n_k$ is related to $\mathbf{N}$ by equation (33).

Estimating the multiplicity function N and the inclination distribution parameters $\kappa$
from n is a straightforward but challenging problem in statistics and optimization. This
problem can be attacked with a variety of methods (linear programming, minimum $\chi^2$,
maximum likelihood, Bayesian analysis using a Markov chain Monte Carlo algorithm, etc.),
and we have experimented with most of these. In this paper we have usually chosen maximum
likelihood, as a reasonable compromise between generality, computation time, and clarity of
interpretation.

The log of the likelihood of a given observational result $\mathbf{n}$ is

$$ \log P(\mathbf{n}|\mathbf{N}, \kappa) = \sum_{k=0}^{K} n_k \log\Big(\sum_{l=k}^{K} G_{kl} N_l\Big) - \sum_{k=0}^{K}\sum_{l=k}^{K} G_{kl} N_l - \sum_{k=0}^{K} \log n_k!. \tag{37} $$

Note that the second term on the right can be simplified to $\sum_l N_l$ using equation ([3]). We then
maximize $\log P$ with respect to $\mathbf{N}$ and $\kappa$, subject to the constraint $N_k \ge 0$, $k = 0, \ldots, K$.
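
As an illustration of the log-likelihood above, the following sketch evaluates it for a supplied matrix G and trial multiplicity function N. It is not the optimization code used for the results in this paper: the function names are hypothetical, the matrix G must be provided by the caller (computing it for a general inclination distribution requires the earlier sections), and the helper `binomial_G` covers only the isotropic case, in which G is binomial.

```python
import math

def expected_counts(G, N):
    """Expected numbers nbar_k = sum_{l >= k} G[k][l] * N[l]."""
    K = len(N) - 1
    return [sum(G[k][l] * N[l] for l in range(k, K + 1)) for k in range(K + 1)]

def log_likelihood(n_obs, G, N):
    """Poisson log-likelihood of the observed counts n_obs given G and N."""
    logp = 0.0
    for nk, mu in zip(n_obs, expected_counts(G, N)):
        if mu > 0.0:
            logp += nk * math.log(mu) - mu
        elif nk > 0:
            return -math.inf          # systems observed where none are expected
        logp -= math.lgamma(nk + 1.0)  # log(n_k!)
    return logp

def binomial_G(K, p):
    """G[k][l] for isotropic systems: k of l planets transit, each with probability p."""
    return [[math.comb(l, k) * p ** k * (1.0 - p) ** (l - k) if k <= l else 0.0
             for l in range(K + 1)]
            for k in range(K + 1)]

# Toy example with made-up numbers (not the Kepler data).
G = binomial_G(3, 0.0321)
N_trial = [9000.0, 500.0, 300.0, 200.0]
n_toy = [9950, 45, 4, 1]
print(log_likelihood(n_toy, G, N_trial))
```

A constrained optimizer would then search over the non-negative entries of N (and over the inclination parameter through G); that step is deliberately left out of the sketch.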



4.3. Results 

The top panel of Figure 3 shows the maximum likelihood as a function of the rms inclination
and the maximum number of planets per system, K, for 6 ≤ K ≤ 40. The minimum allowed
value is K = 6 since Kepler has found one system with six tranets. The maximum-likelihood
models with a given K are connected to form solid lines, and the families with K = 10, 20,
30, and 40 are colored for emphasis. There are occasional small dips in the lines when the
optimization algorithm (a quasi-Newton algorithm from NAG) converged on a local rather
than global maximum. The figure shows that:

(i) The highest likelihood is for razor-thin systems, with near zero rms inclination. However, 
the preference for zero rms inclination has only marginal statistical significance: systems 
exist at all rms inclinations — even isotropic systems — with log likelihood only 0.73 smaller 
than the razor-thin solutions. 



Fig. 3. (top) The maximum likelihood of solutions for the multiplicity function of the
Kepler survey, as a function of rms inclination $\langle\sin^2 i\rangle^{1/2}$ and maximum number K of planets per
system. Solid lines connect solutions with a given K, 6 ≤ K ≤ 40; lines for K = 10, 20, 30, 40
are colored cyan, red, green, and blue for emphasis. The vertical dashed line denotes isotropic
planetary systems. The horizontal dashed line marks systems that are 3-σ ($\Delta\ln L = 4.5$)
lower in likelihood than razor-thin solutions, $\langle\sin^2 i\rangle^{1/2} = 0$. (bottom) Plots of $\chi^2$ (eq. 39) for
the maximum-likelihood models shown above. We estimate that models with $\chi^2 \lesssim 5$ are
good fits to the data.

(ii) Systems with large rms inclinations are only consistent with the data if a fraction of them
contain a large number of planets. At the 3-σ level (log likelihood smaller than the maximum
by 4.5, marked by a horizontal dashed line on the figure), the maximum rms inclination is
related to the maximum number of planets by

$$ \begin{aligned} \langle\sin^2 i\rangle^{1/2} &\lesssim 0.15 + 0.037\,(K - 6), & K &< 24, \\ \langle\sin^2 i\rangle^{1/2} &\le (2/3)^{1/2} \ \text{(isotropic)}, & K &\ge 24. \end{aligned} \tag{38} $$

It is possible, of course, that even the maximum-likelihood model does not fit the data
well. To explore this possibility, we have calculated the standard Pearson $\chi^2$ statistic,

$$ \chi^2 = \sum_{k=0}^{K} \frac{(n_k - \bar n_k)^2}{\bar n_k} = \sum_{k=0}^{K} \frac{\big(n_k - \sum_l G_{kl} N_l\big)^2}{\sum_l G_{kl} N_l}. \tag{39} $$

The distribution of the $\chi^2$ statistic is not straightforward to interpret, since $\bar n_k < 1$ for many
k and since the number of degrees of freedom is not well-defined. Nevertheless it is probably
reasonable to expect that there is a good fit to the data if $\chi^2 \lesssim 5$. The values of $\chi^2$ for the
maximum-likelihood solutions in the top panel of Figure 3 are shown in the bottom panel
of that figure. There are satisfactory models with all rms inclinations, but as before such
models require that some systems contain many planets if the rms inclination is large.
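
The caveat about bins with small expected counts can be made concrete with a short sketch. The counts below are hypothetical and chosen only to show how a single sparsely populated bin can dominate the Pearson statistic; the function name is likewise an assumption.

```python
def pearson_chi2(observed, expected):
    """Pearson chi-square: sum of (n_k - nbar_k)^2 / nbar_k over bins with nbar_k > 0."""
    return sum((n - mu) ** 2 / mu
               for n, mu in zip(observed, expected) if mu > 0.0)

# Hypothetical counts (not the Kepler data): the last bin expects 0.2 systems.
observed = [100, 7, 1]
expected = [98.0, 8.0, 0.2]

terms = [(n - mu) ** 2 / mu for n, mu in zip(observed, expected)]
print("per-bin terms:", [round(t, 3) for t in terms])
print("chi2 =", round(pearson_chi2(observed, expected), 3))
```

In this toy case the sparse bin contributes far more than the two well-populated bins combined, which is why a fixed threshold on the statistic should be read as a rough guide rather than a formal test.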

It is instructive to examine the isotropic solution with K = 30 in more detail (the
behavior of the isotropic solutions with K > 30 is qualitatively similar). The fraction of
stars with k-planet systems is

$$ N_k = \begin{cases} 0.944, & k = 0, \\ 0.0065, & k = 1, \\ 0, & k = 2, \\ 0.0452, & k = 3, \\ 0, & k = 4, \ldots, 29, \\ 0.0043, & k = 30. \end{cases} \tag{40} $$



Thus, in this solution, about half of the planets are contained in three-planet systems, 
and the other half in a small population (< 0.5%) of stars with many-planet systems. This 
multiplicity function and inclination distribution are neither unique nor particularly plausible 
but they are consistent with the Kepler data. 
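
The statements about this solution can be reproduced with a few lines of arithmetic. The sketch below assumes that the entries of the solution not listed explicitly are zero, that transits of different planets are independent with probability $B_0 = 0.0321$ each (the isotropic case), and it uses the rounded fractions quoted above, so the expected counts agree with the observed ones only approximately. Variable and function names are hypothetical.

```python
import math

B0 = 0.0321          # mean transit probability quoted in the text
N_STARS = 124613     # stars in the trimmed Kepler sample
K = 30

# Fraction of stars hosting k planets in the isotropic K = 30 solution.
N = [0.0] * (K + 1)
N[0], N[1], N[3], N[30] = 0.944, 0.0065, 0.0452, 0.0043

def transit_prob(m, n, p=B0):
    """Probability that exactly m of n isotropically oriented planets transit."""
    return math.comb(n, m) * p ** m * (1.0 - p) ** (n - m)

planets_per_star = sum(k * N[k] for k in range(K + 1))
share_in_triples = 3 * N[3] / planets_per_star

# Expected number of stars showing m tranets, m = 1..6.
expected = {m: N_STARS * sum(transit_prob(m, n) * N[n] for n in range(m, K + 1))
            for m in range(1, 7)}
observed = {1: 737, 2: 104, 3: 37, 4: 7, 5: 1, 6: 1}

print(f"planets per star: {planets_per_star:.3f}")
print(f"share of planets in 3-planet systems: {share_in_triples:.2f}")
for m in expected:
    print(m, observed[m], round(expected[m], 1))
```

Nearly all of the multi-tranet systems in this solution come from the small population of 30-planet systems, while the single-tranet systems come mostly from the three-planet systems.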

Figure 4 shows the fraction of stars in the Kepler sample with 0, 1, 2, 3, ...-planet systems,
as a function of the assumed rms inclination. The results are for K = 30 but are
qualitatively similar for larger values of K. Our initial attempts to construct this figure
were unsuccessful, because the appearance of the figure is very sensitive to cases when the
optimization algorithm settles on a local maximum of the likelihood. To avoid this difficulty,
we re-cast the optimization as a problem in linear programming: we demanded that
each $\bar n_k$ should lie within the 90% confidence interval determined by the Poisson distribution
(36), and from these solutions we chose the one with the minimum total number of planets
$\sum_{k=1}^{K} k N_k$. This specifies a unique solution, if one exists.

Fig. 4. The fraction of stars in the Kepler sample containing k-planet systems, as a function
of the rms value of sin i. The curves are labeled by k for k < 13 and curves with 7 < k < 12
are dashed. These curves were obtained by linear programming, using the constraint that
$\bar n_k$ must lie within the 90% confidence interval determined through equation (36). The cost
function minimized the total number of planets but the result is insensitive to this choice.

At the smallest inclinations ($\langle\sin^2 i\rangle^{1/2} < 0.05$) the solution contains a mix of 1, 2, 3, 4,
and 8 or 9-planet systems. As the rms inclination increases, the mixture becomes strongly
dominated by 1-planet and n-planet systems, where n varies monotonically with the rms
inclination; for example, n = 12 when $\langle\sin^2 i\rangle^{1/2} \simeq 0.3$. We caution that these results
should not be regarded as a prediction of the Kepler multiplicity function for a given rms
inclination.

The need for many-planet systems is straightforward to understand. Consider the
extreme case of an isotropic distribution. Then $\kappa = 0$ and $q(i|\kappa = 0) = \frac{1}{2}\sin i$; thus
$Q_\ell(\kappa = 0) = \delta_{\ell 0}$ from equation (ITS]) and the orthogonality properties of the Legendre
polynomials. Thus $U(x|\kappa = 0) = B_0$ (eq. 27), and the matrix G takes the binomial form

$$ G_{mn}(\kappa = 0) = \frac{n!}{m!\,(n-m)!}\, B_0^m (1 - B_0)^{n-m}. \tag{41} $$

If all systems contain n planets, the ratio of the number of m-tranet systems to the number
of (m + 1)-tranet systems is

$$ \frac{G_{mn}}{G_{m+1,n}} = \frac{m+1}{n-m}\,\frac{1 - B_0}{B_0}, \qquad n \ge m + 1. \tag{42} $$

Using equations (28) and (35) we find that $B_0 = 0.0321$ for the Kepler survey. From equation
(34) we find $n_1/n_2 = 7.1 \pm 0.7$. For comparison the ratio $G_{1n}/G_{2n}$ is less than 7.1 + 0.7 = 7.8
only for n ≥ 9; thus any population dominated by systems with fewer than 9 planets will
overproduce 1-tranet systems relative to 2-tranet systems. Similarly, for the Kepler survey
$n_2/n_3 = 2.8 \pm 0.5$, and $G_{2n}/G_{3n} > 2.8 + 0.5 = 3.3$ unless n ≥ 30.
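
The two thresholds quoted in this argument follow directly from the binomial ratio with $B_0 = 0.0321$. The sketch below evaluates that ratio, $(m+1)/(n-m) \times (1-B_0)/B_0$, and searches for the smallest number of planets per system that brings it down to a given value; the function names are hypothetical.

```python
B0 = 0.0321  # mean transit probability for the Kepler sample (from the text)

def tranet_ratio(m, n, b0=B0):
    """Ratio of m-tranet to (m+1)-tranet systems when every system has n planets."""
    if n < m + 1:
        raise ValueError("need n >= m + 1")
    return (m + 1) / (n - m) * (1.0 - b0) / b0

def smallest_n(m, max_ratio):
    """Smallest n for which the ratio does not exceed max_ratio."""
    n = m + 1
    while tranet_ratio(m, n) > max_ratio:
        n += 1
    return n

n_for_singles = smallest_n(1, 7.1 + 0.7)   # observed n1/n2 plus its uncertainty
n_for_doubles = smallest_n(2, 2.8 + 0.5)   # observed n2/n3 plus its uncertainty
print(n_for_singles, n_for_doubles)        # prints: 9 30
```

Because the ratio falls only as 1/(n - m), matching the observed ratios with isotropic systems requires planet numbers that grow quickly with the tranet multiplicity being matched.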

The average number of planets per star from these solutions is shown in Figure 5. This
result is insensitive to the rms inclination and the maximum number of planets per star (K),
since it is given simply by the ratio of the total number of planets to the number of target
stars, divided by the probability that a single randomly oriented planet will transit (Youdin
2011). Mathematically,

$$ \langle\text{number of planets per star}\rangle = \frac{\sum_k k\, n_k}{B_0 \sum_k n_k} = 0.274. \tag{43} $$
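
As a quick arithmetic check of this value, the sketch below divides the number of tranets per target star by the mean transit probability, using the observed counts of §4.1, the trimmed sample of 124,613 stars, and $B_0 = 0.0321$. The variable names are hypothetical.

```python
B0 = 0.0321        # mean transit probability for the Kepler sample (from the text)
N_STARS = 124613   # trimmed target sample

# Number of stars with k tranets, k >= 1, as listed in Section 4.1.
tranet_counts = {1: 737, 2: 104, 3: 37, 4: 7, 5: 1, 6: 1}

n_tranets = sum(k * count for k, count in tranet_counts.items())
planets_per_star = n_tranets / (N_STARS * B0)
print(f"{planets_per_star:.3f}")
```

No inclination distribution enters this estimate, which is why the result does not depend on the rms inclination or on K.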



The large open circles in Figure 5 show the probability that a system with one, two,
or three tranets has additional planets. Typically the fraction of one-tranet systems with
additional planets is 0.2-0.5, without a strong dependence on rms inclination. For two or
three tranets the probability that there are additional unseen planets is substantially higher.
The additional planets may be detectable by transit timing variations (Ford et al. 2011).



5. Combining Kepler and radial-velocity surveys

As described in the Introduction, a comparison of the observable multiplicity functions of
planetary systems detected by radial velocities and by transits can offer a powerful probe
of the inclination distribution. The principal obstacle to making this comparison is that
the masses and orbital periods of the planets detected through these two observational
techniques are quite different, as illustrated in Figure 6, and the multiplicity functions in
these two regions of parameter space are likely to be different. In this section we use the
separability approximation and the methods of §2 to overcome this obstacle.

Suppose that we wish to combine the Kepler survey with a radial-velocity (RV) survey
(or a set of such surveys). The surveys yield $n_k^{\rm Kep}$ and $n_k^{\rm RV}$ systems containing k planets. We
assume that both surveys have similar target star populations (we cull the list of target stars
in both cases to include only FGK dwarfs), with multiplicity function N for Kepler and cN
for the RV survey, where c < 1 is a constant to be determined. Let $S(W^{\rm Kep})$ and $S(W^{\rm RV})$
be the survey selection functions. We assume that there are no geometric selection effects
for the RV surveys (cf. footnote 2). The generalization of equation (36) for the likelihood is

$$ P(\mathbf{n}^{\rm Kep}, \mathbf{n}^{\rm RV}|\mathbf{N}, \kappa) = \prod_{k=0}^{K} \frac{(\bar n_k^{\rm Kep})^{n_k^{\rm Kep}}\, e^{-\bar n_k^{\rm Kep}}}{n_k^{\rm Kep}!} \; \prod_{k=1}^{K} \frac{(\bar n_k^{\rm RV})^{n_k^{\rm RV}}\, e^{-\bar n_k^{\rm RV}}}{n_k^{\rm RV}!}, \tag{44} $$

where

$$ \bar{\mathbf{n}}^{\rm Kep} = \mathbf{G}(\kappa)\,\mathbf{S}(W^{\rm Kep})\,\mathbf{N}, \qquad \bar{\mathbf{n}}^{\rm RV} = c\,\mathbf{S}(W^{\rm RV})\,\mathbf{N}. \tag{45} $$

Notice that the second product in equation (44) starts at k = 1 since it is difficult to
determine accurately how many stars have been unsuccessfully examined for planets by
RV methods (see further discussion below). We then maximize the likelihood (44) over
$N_0, N_1, \ldots, N_K$, $W^{\rm Kep}$, $W^{\rm RV}$, and c (as shown in §2 the likelihood actually depends only on
the ratio $W^{\rm RV}/W^{\rm Kep}$).

We determine the observable multiplicity function for RV planets using all planets with
FGK dwarf host stars in the exoplanets.org database (Wright et al. 2011) as of August 2010:

$$ n_1^{\rm RV} = 162, \quad n_2^{\rm RV} = 24, \quad n_3^{\rm RV} = 7, \quad n_4^{\rm RV} = 1, \quad n_5^{\rm RV} = 1, \quad n_k^{\rm RV} = 0 \ \text{for } k > 5, \tag{46} $$

for a total of 240 planets. The observable multiplicity function for Kepler planets is given
in equation (34). Figure 7 shows the maximum likelihood as a function of the rms inclination
and the maximum number of planets per system, K (top), as well as $\chi^2$ for these
models (bottom). The plots are noisier than Figure 3, presumably because the optimization
algorithm was less successful at finding the global maximum likelihood, but otherwise look
similar. In particular, systems with large rms inclinations are consistent with the data if and
only if they contain a large number of planets. Evidently adding data from RV surveys has
not significantly tightened the constraints on the inclination distribution.

Fig. 5. The horizontal blue line, composed of ~7000 points from individual maximum-
likelihood models, shows the average number of planets per star in the Kepler sample, as a
function of the rms inclination and the maximum number of planets per star, 11 < K < 40.
The large open circles show the probability that a system exhibiting one, two, or three
tranets has additional planets.

Fig. 6. The orbital periods and masses of the planets detected by Kepler (green) and by
ground-based radial-velocity surveys (red). Orbital periods are in days and masses are in
Jupiter masses. Masses M for transiting planets are computed from radii R using
$M = (R/R_\oplus)^{2.06} M_\oplus$ (Lissauer et al. 2011b; for a more accurate relation see eq. 47) and masses for
radial-velocity planets are minimum masses $M\sin\gamma$.

We now show that adding information on the total number of target stars in the RV
surveys does allow the inclination distribution to be determined. Figure 8 shows the expected
numbers $\bar n_k^{\rm Kep}$ and $\bar n_k^{\rm RV}$ of k-tranet systems from the Kepler survey and k-planet systems from
the RV surveys, as determined from the maximum-likelihood solutions described above. Each
point corresponds to a given maximum number of planets (6 ≤ K ≤ 40) and rms inclination,
and only solutions within 3-σ of the global maximum likelihood are shown. The points with
error bars (surrounded by circles for greater visibility) correspond to the observed numbers
$n_k^{\rm Kep}$ and $n_k^{\rm RV}$ from equations (34) and (46). Most of the expected values lie within the
error bars of the corresponding observed value; this is no more than a confirmation that our
optimization code is performing properly. The blue points show the total number of stars in
the RV survey, $n_{\rm tot}^{\rm RV} = \sum_{k=0}^{K} \bar n_k^{\rm RV}$, as determined by the optimization code. The plot shows
that $n_{\rm tot}^{\rm RV}$ is tightly correlated with the rms inclination, so an accurate characterization of the
total number of RV target stars would enable the determination of the rms inclination.

This task is challenging given the heterogeneous surveys that have produced the RV 
planets known at the present time. We have used two distinct approaches, which we now 
describe. 



(i) Cumming et al. (2008) carry out a careful examination of selection effects in the Keck
Planet Search, and derive the percentage of F, G, and K stars with a planet in various
ranges of orbital period and mass. The sample of RV planets used in our analysis (eq. 46) is
not corrected for selection effects, but for sufficiently massive planets and sufficiently short
orbital periods it should be complete. For example, for planets more massive than Jupiter,
$M\sin\gamma > M_J$, with orbital periods less than one year, P < 1 yr, the velocity semi-amplitude
$K_{\rm RV} > 30$ m s$^{-1}$, large enough to be detectable in most surveys. In this mass and period
range our sample contains 46 planet-hosting stars and Cumming et al. (2008) estimate that
the fraction of stars with planets is 0.019 ± 0.007, which implies $n_{\rm tot}^{\rm RV} = 2400 \pm 900$. Altering
the period range to P < 100 d gives $n_{\rm tot}^{\rm RV} = 2500 \pm 1200$ (based on 21 host stars); altering
the mass cutoff to $M\sin\gamma > 0.5 M_J$ gives $n_{\rm tot}^{\rm RV} = 1900 \pm 500$ (based on 63 host stars). This
last estimate of $n_{\rm tot}^{\rm RV}$ is probably low because the surveys we have used are not all complete
at this level.

(ii) We may estimate $n_{\rm tot}^{\rm RV}$ using the tranet frequency derived from the Kepler mission.

Fig. 7. As in Figure 3 except the data include both the Kepler transit survey and radial-
velocity surveys.

Fig. 8. The expected numbers of 0, 1, 2, 3 tranet systems from the Kepler survey and of
1, 2, 3 planet systems from RV surveys, as predicted by our models. The observed numbers
are shown as error bars surrounded by circles. Also shown is the total number of targets in
the RV surveys as predicted by our models (blue points).

Fig. 9. The estimated number of host stars in RV surveys, as determined by comparison
with the Kepler survey. The curves and associated error bars show the number of RV host
stars as estimated by comparing the number of RV and Kepler planets with period less
than P and mass exceeding that required to induce a given velocity semi-amplitude $K_{\rm RV}$ at
period P. The observed number of Kepler planets is multiplied by $\epsilon^{-1}$ (eq. [13]) to correct for
geometric selection effects, and the conversion between radius and mass is given by equation
(47). Results are shown for four semi-amplitudes, $K_{\rm RV}$ = 25, 20, 15, 10 m s$^{-1}$; the plot at the
smallest semi-amplitude is low because the RV surveys are incomplete at this level.

Once again, we restrict the Kepler sample to host stars that are F, G, and K dwarfs
(4000 K < $T_{\rm eff}$ < 6500 K and log g > 4). We then carry out the following steps for a given
orbital period P and velocity semi-amplitude $K_{\rm RV}$: (i) compute the corresponding mass
$M(P, K_{\rm RV}) = M_J\,(K_{\rm RV}/30\ {\rm m\,s^{-1}})\,(P/1\ {\rm yr})^{1/3}$ assuming a circular orbit and a solar-mass host
star; (ii) find the number $n^{\rm RV}(P, K_{\rm RV})$ of RV planets with period less than P and mass
greater than $M(P, K_{\rm RV})$; (iii) find all Kepler tranets with mass greater than $M(P, K_{\rm RV})$
and period less than P, using an empirical mass-radius relation found by fitting mass and
radius measurements from transiting planets in the range 0.1-10 $M_J$ (see Figure 10) to a
log-quadratic relation

$$ \log R/R_J = 0.087 + 0.141 \log M/M_J - 0.171\,(\log M/M_J)^2; \tag{47} $$

(iv) compute the total number of Kepler planets in this range $n^{\rm Kep}(P, K_{\rm RV})$ by counting
each tranet as $\epsilon^{-1}$ planets, to correct for geometric selection effects (eq. [d]); (v) estimate the
total number of RV host stars as $n_{\rm tot}^{\rm RV} = n_{\rm tot}^{\rm Kep}\, n^{\rm RV}(P, K_{\rm RV})/n^{\rm Kep}(P, K_{\rm RV})$. The results are shown
in Figure 9 for $K_{\rm RV}$ = 10, 15, 20, 25 m s$^{-1}$. As the majority of RV surveys have reached
precisions of ~15 m s$^{-1}$ or better over the last decade, it is reassuring but not surprising
that the estimates of $n_{\rm tot}^{\rm RV}$ for $K_{\rm RV}$ = 15, 20, 25 m s$^{-1}$ are consistent. The rise in $n_{\rm tot}^{\rm RV}$ at small
periods is likely due to the known discrepancy in hot Jupiter frequency between transit and
RV surveys (the frequency of hot Jupiters estimated from transit surveys is a factor of ~2
smaller than that derived from RV surveys, perhaps because the average metallicities are
different; see Gould et al. 2006; Howard et al. 2011).
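
The two approaches to estimating the number of RV target stars can be sketched as follows. This is an illustration under stated assumptions, not the code used for Figure 9: all function names are hypothetical; the first estimate propagates only the uncertainty in the occurrence fraction; and the mass threshold uses the standard circular-orbit scaling, in which the mass required for a fixed semi-amplitude grows as $P^{1/3}$, normalized to one Jupiter mass for 30 m s$^{-1}$ at a period of one year.

```python
import math

def n_tot_from_occurrence(n_hosts, frac, frac_err):
    """Approach (i): target stars implied by n_hosts hosts and occurrence rate frac.

    Only the uncertainty on frac is propagated (an assumption of this sketch).
    """
    n_tot = n_hosts / frac
    return n_tot, n_tot * frac_err / frac

def mass_for_semi_amplitude(period_yr, k_rv):
    """Mass (Jupiter masses) giving semi-amplitude k_rv (m/s) at period_yr (years),
    for a circular orbit around a solar-mass star."""
    return (k_rv / 30.0) * period_yr ** (1.0 / 3.0)

def radius_from_mass(m_jup):
    """Log-quadratic mass-radius relation, in Jupiter units."""
    x = math.log10(m_jup)
    return 10.0 ** (0.087 + 0.141 * x - 0.171 * x * x)

def n_tot_from_kepler(n_kep_stars, n_rv, eps_of_matching_tranets):
    """Approach (ii): weight each matching Kepler tranet by 1/eps, then scale
    the number of Kepler target stars by the RV-to-Kepler planet ratio."""
    n_kep = sum(1.0 / eps for eps in eps_of_matching_tranets)
    return n_kep_stars * n_rv / n_kep

n_tot, n_err = n_tot_from_occurrence(46, 0.019, 0.007)
print(f"approach (i): {n_tot:.0f} +/- {n_err:.0f}")
print(f"mass threshold at P = 0.25 yr, K = 15 m/s: "
      f"{mass_for_semi_amplitude(0.25, 15.0):.2f} M_J")
```

The mass-radius relation is not monotonic (the radius peaks at a few Jupiter masses), so converting a tranet radius to a mass requires choosing a branch; the sketch therefore provides only the forward relation.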



These independent approaches yield $n_{\rm tot}^{\rm RV} \simeq 2500 \pm 1000$ and $n_{\rm tot}^{\rm RV} \simeq 3000 \pm 1000$, respectively,
which are consistent within the errors. The corresponding inclination ranges from
Figure 8 are $0 < \langle\sin^2 i\rangle^{1/2} < 0.08$ and $0.02 < \langle\sin^2 i\rangle^{1/2} < 0.09$, which correspond to an
rms or mean inclination range of 0-5° (as shown in §3.1, for a Rayleigh distribution the rms
inclination is only larger than the mean inclination by 12%, which is much less than the
uncertainty).
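
The conversion from $\langle\sin^2 i\rangle^{1/2}$ to degrees can be illustrated with a short sketch. It assumes that, for these small values, $\langle\sin^2 i\rangle^{1/2}$ can be treated as the sine of the rms inclination, and it uses the textbook moments of a Rayleigh distribution, for which the rms-to-mean ratio is $2/\sqrt{\pi} \approx 1.13$, comparable to the difference quoted above. The function name is hypothetical.

```python
import math

def rms_inclination_deg(rms_sin_i):
    """rms inclination in degrees, treating <sin^2 i>^(1/2) as the sine of the rms angle."""
    return math.degrees(math.asin(rms_sin_i))

# Rayleigh distribution with scale s: mean = s*sqrt(pi/2), rms = s*sqrt(2).
rayleigh_rms_over_mean = math.sqrt(2.0) / math.sqrt(math.pi / 2.0)

for s in (0.02, 0.08, 0.09):
    print(s, round(rms_inclination_deg(s), 2))
print(round(rayleigh_rms_over_mean, 3))
```

The upper limits of 0.08 and 0.09 correspond to roughly 4.6° and 5.2°, which is the origin of the 0-5° range.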

The success of the separability assumption in modeling survey selection effects (§2 and
Fig. [d]) suggests that our results should be insensitive to cuts made on the Kepler planet
candidates. To check this, we have repeated the analysis for the Kepler sample examined by
Lissauer et al. (2011b), who imposed a period cut 3 d < P < 125 d, a radius cut $1.5 R_\oplus <
R < 6 R_\oplus$, and a signal/noise cut SNR > 16, which reduced the number of planets to 63%
of our sample. We find the mean inclination for this sample to be 0-4°, not significantly
different from the estimate in the preceding paragraph.

Although the range of rms inclinations is tightly constrained by this analysis, the
multiplicity function is not. For example, within 1-σ of the maximum-likelihood model
($\Delta\log P < 0.5$) we have found models that have no 1-planet systems (67% have no planets,
29% have 2 planets, and 4% have 13 planets) and others that have no zero-planet systems
(93% have 1 planet, 2% have 6 planets, and 5% have 25 planets).

A by-product of this analysis is the ratio $W^{\rm RV}/W^{\rm Kep}$ (eq. 45), which measures the
relative sensitivity of the RV and Kepler surveys. This ratio varies smoothly from 0.5 for
razor-thin systems to 0.2 for $\langle\sin^2 i\rangle^{1/2} = 0.1$, independent of the maximum number of
planets in the model. In other words 20-50% of the Kepler planets could have been detected 
in RV surveys. If this ratio can be determined independently by fitting models of the period, 
radius, and mass distributions it will provide a constraint on the rms inclination that does 
not require estimating the total number of RV target stars. 

A weak link in these arguments is the assumption that the population of FGK dwarf stars 
is the same in the Kepler and RV surveys. One sign that these populations are different is the 
higher frequency of hot Jupiters found in RV surveys, as mentioned above. However, we note 
that our two approaches to estimating $n_{\rm tot}^{\rm RV}$, one using only RV surveys and one comparing
the Kepler and RV surveys, yield similar answers, which suggests that the estimate of the 
rms inclination that we derive from this answer is insensitive to differences between the host 
stars of the Kepler and RV surveys. 

It is interesting to compare this estimate of the mean inclination to the mean eccentricity 
for Kepler planets. Restricting our sample to planets with minimum mass between 0.01 and 
0.1 Jupiter masses and period P > 10 d (to avoid the effects of tidal circularization), the 
mean eccentricity of planets discovered in RV surveys is 0.15 (we have also excluded planets 
with a reported eccentricity of zero, which may include cases in which no eccentricity was fit). 
These results are roughly consistent with estimates of the mean eccentricity of Kepler planets
from transit timing: Moorhead et al. (2011) find that the mean eccentricity is between 0.13
and 0.25 at a p-value of 0.05. We have

$$ \frac{\langle i\rangle}{\langle e\rangle} = 0.35\,\frac{\langle i\rangle}{3^\circ}\,\frac{0.15}{\langle e\rangle}. \tag{48} $$

Theoretical studies of eccentricity and inclination growth in planetesimal disks (e.g., Ida et al.
1993) find $\langle i\rangle/\langle e\rangle = 0.45$-$0.5$, somewhat larger than this value. A possible explanation is
that the eccentricities may have been systematically overestimated. Zakamska et al. (2011)
find that the typical bias due to measurement errors is $\Delta e \simeq 0.04$ in RV catalogs, and the
bias in this sample is likely to be higher since the SNR is low for low-mass planets. Possibly
a similar bias is present in the Kepler measurements of the eccentricity distribution.

The Kepler survey can measure transit timing variations of a minute or less in favorable
cases (Ford et al. 2011). These variations can be used to detect and characterize additional
planets. Given the rms inclination of 0-0.09 radians that we have derived, roughly 20-30%
of the single-tranet Kepler systems are expected to have additional planets (Figure 5), and
many of these may be detectable by transit timing variations. Ford et al. (2011) estimate
that ~10-20% of suitable Kepler tranets show evidence of transit timing variations, and
this number is likely to increase as the survey duration grows. Figure 5 also shows that the
fraction of two- or three-tranet systems with additional planets is substantially higher, and
strongly dependent on the rms inclination. A preliminary analysis by Ford et al. (2011) yields
much lower probabilities of 0.1-0.2 for two- and three-tranet systems; such low probabilities
would be difficult to reconcile with any of our models, whatever the rms inclination may be.

Fig. 10. The masses and radii of confirmed transiting exoplanets. The green solid line is
the log-quadratic fit in equation (47). The red dashed line is the log-linear fit $\log(M/M_\oplus) =
2.06\log(R/R_\oplus)$ from Lissauer et al. (2011b).



6. Summary 



We have described a methodology for analyzing the multiplicity function — the fraction of 
host stars containing a given number of planets — in radial-velocity (RV) and transit surveys. 
Our approach is based on the approximation of separability, that the probability distri- 
bution of planetary parameters in an n-planet system is the product of identical 1-planet 
distributions (§1.2). Exoplanet surveys show that separability is not precisely satisfied but
the departures from this approximation are small enough that it provides a powerful tool for
the study of multi-planet systems. Using this approximation we have shown how to relate
the observable multiplicity function in surveys with different sensitivities, so long as they
examine populations of potential host stars with similar properties (§2). We have also shown
how to derive the multiplicity function from transit surveys assuming a given form for
the inclination distribution (the Fisher distribution, §3.1). Our principal conclusions are:



1. At present, the Kepler data alone (Borucki et al. 2011) are not able to constrain the
inclination distribution of multi-planet systems without additional assumptions or data.
In particular, models with all rms inclinations, from razor-thin to spherical, are able
to reproduce the observable multiplicity function in the Kepler sample. This conclusion
differs from Lissauer et al. (2011b), who found that (i) the Kepler data contained an
excess of single-tranet systems that could not be fit by any of their models; (ii) models
with mean inclinations exceeding 5° were poor fits to the data. We believe that these
conclusions reflect the restricted, though plausible, range of models for the multiplicity
function considered by Lissauer et al. (2011b), although their estimated upper limit to
the mean inclination is entirely consistent with our conclusions below based on other
methods.

2. Systems with large rms inclinations are only consistent with the Kepler data if at least 
some of them contain a large number of planets. The relation between rms inclination 
and maximum number of planets is given by equation (38).

3. In our models, the percentage of one-tranet systems with additional planets is 20-30%,
and for two- or three-tranet systems this percentage is even higher (Figure [5]). These 
fractions can be probed observationally using transit timing variations. 

4. The rms inclination can be constrained by combining estimates of the observable
multiplicity function from Kepler and RV surveys, but only after estimating the effective
number of stars that have been examined in RV surveys. We have made two estimates,
one using Kepler data and one without; these are consistent, and yield $\langle\sin^2 i\rangle^{1/2} < 0.09$,
corresponding to mean inclinations in the range 0-5°.

5. Although the range of rms inclinations is tightly constrained by this analysis, the
multiplicity function is not: the data are well fit by (presumably) pathological models
containing no zero-planet systems, no one-planet systems, etc.